Distance Measures for Geometric Graphs

FWCG, 2022

Sushovan Majhi, UC Berkeley
Carola Wenk, Tulane University

Motivation

  • Graphs representaion facilitates easy desciption of relational patterns
  • Pattern Recognition becomes the problem of measuring similarity between two graphs
  • A distance measure is deemed meaningful if
    • small distance implies similarity
    • large distance reveals dissimilarity
  • Graph comaprison has been widely studied both by the Pattern Recognition and Database community.

This Morning’s Agenda

  • What is a Geometric Graph?
  • Where are they?
  • Why defining a similarity measure for them is challenging?
  • The two friends: GED and GGD
  • How the two measures compare?
  • Computational complexity results
  • Questions?

Geometric Graphs \(\mathbb{R}^d\)

A combinatorial graph \(G=(V,E)\) is called geometric if

  • \(V\subset\mathbb{R}^d\)
  • An edge \((u,v)\in E\) is identified with the line segment \(\overline{uv}\)
  • edges intersect only at their endpoints

Applications

Letter Recognition

IAM (Riesen and Bunke 2008)

Applications

Map Comaprison

Source: mapconstruction.org (Ahmed et al. since 2013)

How to Define Distance between Two Graphs?

Given two geometric graphs \(G, H\):

  • check if they are (combinatorial) isomorphic
    • not robust
    • not lenient to local, minor defects
  • correct small defects, but at a small cost

Geometric Edit Operations

  • deletion (or insertion) of a vertex does not cost
  • deletion (or insertion) of an edge costs \(C_E\) times its length
  • translation of a vertex costs \(C_V\) times the displacment plus \(C_E\) times the total length changes in the incident edges.

An Example Edit Path

Geometric Edit Distance (GED)

\[GED(G,H):=\inf cost(P),\] where the infimum is taken over all edit paths from \(G\) to \(H\).

GED is a metric for positive \(C_V, C_E\).

Cons

  • not computationally feasible
  • infinitely many edit paths
  • uncountable alphabet

Geometric Graph Distance (GGD)

Inexact Matching Correspondence

  • define \(\pi\)

\[ \cost(\pi)= \underbrace{\sum_{\substack{u\in V^G \\ \pi(u)\neq\epsilon_V}} C_V\lvert u-\pi(u)|}_\text{vertex translations} + \underbrace{\sum_{\substack{e\in E^G \\ \pi(e)\neq\epsilon_E}} C_E\big||e|-|\pi(e)|\big|}_\text{edge translations} + \underbrace{\sum_{\substack{e\in E^G \\ \pi(e)=\epsilon_E}} C_E|e|} _\text{edge deletions} + \underbrace{\sum_{\substack{f\in E^H \\ \pi^{-1}(f)=\epsilon_E}} C_E|f|} _\text{edge deletions} \]

\[ \ggd(G,H)\eqdef\min_{\pi\in\Pi(G,H)}\cost(\pi). \]

  • GGD is a metric.

Comparing GED and GGD

\[ \ggd(G,H)\leq\ged(G,H)\leq\left(1+\Delta\frac{C_E}{C_V}\right)\ggd(G,H) \]

Compexity of GGD

Known Results

  • NP-hard for highly non-planar instances
  • NP-hard for planar if \(C_V<<C_E\)

Our Results

  • NP-hard even if (i) planar, (ii) \(C_V,C_E\) arbitrary

Further Questions

  • APX-hardness
  • PSAT
  • How to choose \(C_V, C_E\), machine learning??

Thanks

Say hi: smajhi@berkeley.edu

Ahmed, M., S. Karagiorgou, D. Pfoser, and C. Wenk. since 2013. “Map Construction Portal.” mapconstruction.org.
Cheong, Otfried, Joachim Gudmundsson, Hyo-Sil Kim, Daria Schymura, and Fabian Stehn. 2009. “Measuring the Similarity of Geometric Graphs.” In Experimental Algorithms, edited by Jan Vahrenhold, 5526:101–12. Springer. https://doi.org/10.1007/978-3-642-02011-7_11.
Riesen, Kaspar, and Horst Bunke. 2008. IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning.” In Structural, Syntactic, and Statistical Pattern Recognition, 5342:287–97. Springer. https://doi.org/10.1007/978-3-540-89689-0_33.